Abstract

We consider the question of the existence of variables with few occurrences in boolean
conjunctive normal forms (clause-sets). Let $\mu\mathrm{vd}(F)$ for a clause-set $F$ denote the
minimal variable-degree, the minimum of the number of occurrences of variables. Our main result
is an upper bound $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F)) \le \sigma(F) + 1 + \log_2(\sigma(F))$
for lean clause-sets $F$ in dependency on the surplus $\sigma(F)$. Lean clause-sets, defined as
having no non-trivial autarkies, generalise minimally unsatisfiable clause-sets. For the surplus
we have $\sigma(F) \le \delta(F) = c(F) - n(F)$, using the deficiency $\delta(F)$ of clause-sets,
the difference between the number of clauses and the number of variables. $\mathrm{nM}(k)$ is the
$k$-th "non-Mersenne" number, skipping in the sequence of natural numbers all numbers of the form
$2^n - 1$. As an application of the upper bound we obtain that clause-sets $F$ violating
$\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$ must have a non-trivial autarky (so clauses can be
removed satisfiability-equivalently by an assignment satisfying some clauses and not touching the
other clauses). It is open whether such an autarky can be found in polynomial time.



1 Introduction 



We study the existence of "simple" variables in boolean conjunctive normal forms, considered as
clause-sets. "Simple" here means that the variable does not occur very often. A major use of the
existence of such variables is in inductive proofs of properties of minimally unsatisfiable
clause-sets, using splitting on a variable to reduce $n$, the number of variables, to $n - 1$:
here it is vital that we have control over the changes imposed by the substitution, and so we
want to split on a variable occurring as few times as possible. The background for these
considerations is the enterprise of classifying minimally unsatisfiable clause-sets $F$ in
dependency on the deficiency $\delta(F) := c(F) - n(F)$, the difference between the number
$c(F) := |F|$ of clauses of $F$ and the number $n(F) := |\mathrm{var}(F)|$ of variables of $F$.
The most basic fact is $\delta(F) \ge 1$, as first shown in [?]. For $\delta(F) = 1$ the
structure is completely known ([?]), for $\delta(F) = 2$ the structure after reduction of
singular variables (occurring in one sign only once) is known ([?]), while for
$\delta(F) \in \{3, 4\}$ only basic cases have been classified ([?]).

♦Supported by NSFC Grant 60970040. 
The starting point of our investigation is Lemma C.2 in [?], where it is shown that a minimally
unsatisfiable clause-set $F$ must have a variable $v$ with at most $\delta(F)$ positive and at
most $\delta(F)$ negative occurrences; we write this as $\mathrm{ld}_F(v) \le \delta(F)$ and
$\mathrm{ld}_F(\overline{v}) \le \delta(F)$, using the notion of literal degrees (the number of
occurrences of the literal). Thus we have $\mathrm{vd}_F(v) \le 2\delta(F)$, using the variable
degree $\mathrm{vd}_F(v) := \mathrm{ld}_F(v) + \mathrm{ld}_F(\overline{v})$. Using the minimum
variable degree (min-var-degree) $\mu\mathrm{vd}(F) := \min_{v \in \mathrm{var}(F)} \mathrm{vd}_F(v)$
of $F$, this becomes $\mu\mathrm{vd}(F) \le 2\delta(F)$. In this article we show a sharper bound
on $\mu\mathrm{vd}(F)$ for a larger class of clause-sets $F$. More precisely, we show that the
worst cases of the two bounds $\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v}) \le \delta(F)$
cannot occur at the same time (for a suitable variable), but actually
$\mathrm{ld}_F(v) + \mathrm{ld}_F(\overline{v}) - \delta(F)$ only grows logarithmically in
$\delta(F)$, and this for a larger class of formulas.

The larger class of clause-sets considered is the class $\mathcal{LEAN}$ of lean clause-sets,
which are clause-sets having no non-trivial autarky. For an overview on the theory of minimally
unsatisfiable clause-sets and on the theory of autarkies see [?]. The deficiency
$\delta(F) \in \mathbb{Z}$ of clause-sets is replaced by the surplus $\sigma(F) \in \mathbb{Z}$,
which is the minimal deficiency over all clause-sets $F[V]$ for non-empty variable sets
$V \subseteq \mathrm{var}(F)$, where $F[V]$ is obtained from $F$ by removing clauses which have
no variables in $V$, and restricting the remaining clauses to $V$; see [?] for more information
on the surplus of (generalised) clause-sets. We need to count multiple occurrences of clauses
here (which might arise during the process of removing literals with variables not in $V$), and
thus actually multi-clause-sets $F$ are used here. Note that by considering
$V = \mathrm{var}(F)$ we have $\sigma(F) \le \delta(F)$, and by considering $V = \{v\}$ for
$v \in \mathrm{var}(F)$ we get $\sigma(F) \le \mu\mathrm{vd}(F) - 1$. Now the main result of this
article (Theorem 4.1) is

    $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$

for lean $F$, where $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ (see Definition 3.1) is a
super-linear function with $\mathrm{nM}(k) \le k + 1 + \log_2(k)$. As an application we obtain
(Corollary 4.2) that if a (multi-)clause-set $F$ has no variable occurring with degree at most
$\delta(F) + 1 + \log_2(\delta(F))$, then $F$ has a non-trivial autarky. It is an open problem
whether such an autarky can be found in polynomial time (for arbitrary clause-sets $F$); we
conjecture (Conjecture 4.3) that this is possible.



Related work This article appears to be the first systematic study of the problem of minimum
variable occurrences in minimally unsatisfiable clause-sets and generalisations, in dependency
on the deficiency, asking for the existence of a variable occurring "infrequently" in general,
or for extremal examples where all variables occur not infrequently. The problem of maximum
variable occurrences (asking for the existence of a variable occurring frequently in general, or
for extremal examples where all variables occur not frequently) in uniform (minimally)
unsatisfiable clause-sets, in dependency on the (constant) clause-length, has been studied in
the literature, starting with [?]; for a recent article see [?].



Overview In Section 2 basic notions and concepts regarding clause-sets, autarkies and minimal
unsatisfiability are reviewed. Section 3 introduces the numbers $\mathrm{nM}(k)$ and proves exact
formulas and sharp lower and upper bounds. Section 4 contains the main results. First in
Subsection 4.1 the bound is shown for minimally unsatisfiable clause-sets (Theorem 4.5). In
Subsection 4.2 the bound then is lifted to lean clause-sets, proving Theorem 4.1. The immediate
corollary of Theorem 4.1 is that if the asserted upper bound on the minimal variable degree is
not fulfilled, then a non-trivial autarky must exist (Corollary 4.2). In Subsection 4.3 the
problem of finding such an autarky is discussed, with Conjecture 4.3 making precise our belief
that one can find such autarkies efficiently. In Section 5 we discuss the sharpness of the bound,
and the possibilities to generalise it further. Finally, in Section 6 open problems are stated,
culminating in the central Conjecture 6.1 about the classification of unsatisfiable hitting
clause-sets (or "disjoint tautologies" in the terminology of DNFs).



2 Preliminaries 

We follow the general notations and definitions as outlined in [?], where also further
background on autarkies and minimal unsatisfiability can be found. We use
$\mathbb{N} = \{1, 2, \dots\}$ and $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$.

2.1 Clause-sets 

Complementation of literals $x$ is denoted by $\overline{x}$, while for a set $L$ of literals we
define $\overline{L} := \{\overline{x} : x \in L\}$. A clause $C$ is a finite and clash-free set
of literals (i.e., $C \cap \overline{C} = \emptyset$), while a clause-set is a finite set of
clauses. We use $\mathrm{var}(F) := \bigcup_{C \in F} \mathrm{var}(C)$ for the set of variables
of $F$, where $\mathrm{var}(C) := \{\mathrm{var}(x) : x \in C\}$ is the set of variables of clause
$C$, while $\mathrm{var}(x)$ is the underlying variable for a literal $x$. For a clause-set $F$ we
denote by $n(F) := |\mathrm{var}(F)| \in \mathbb{N}_0$ the number of variables and by
$c(F) := |F| \in \mathbb{N}_0$ the number of clauses. The deficiency of a clause-set is denoted
by $\delta(F) := c(F) - n(F) \in \mathbb{Z}$. We call a clause $C$ full for a clause-set $F$ if
$\mathrm{var}(C) = \mathrm{var}(F)$, while a clause-set $F$ is called full if every clause is
full. For a finite set $V$ of variables let $A(V)$ be the set of all $2^{|V|}$ full clauses over
$V$. Thus full clause-sets are exactly the sub-clause-sets of some $A(V)$. A partial assignment
is a map $\varphi : V \to \{0, 1\}$ for some (possibly empty) set $V$ of variables. The
application of a partial assignment $\varphi$ to a clause-set $F$ is denoted by $\varphi * F$,
which yields the clause-set obtained from $F$ by removing all satisfied clauses (which have at
least one literal set to 1), and removing all falsified literals from the remaining clauses. A
clause-set $F$ is satisfiable iff there is a partial assignment $\varphi$ with
$\varphi * F = \top := \emptyset$, otherwise $F$ is unsatisfiable. All $A(V)$ are unsatisfiable.

These notions are generalised to multi-clause-sets, which are pairs $(F, m)$, where $F$ is a
clause-set and $m : F \to \mathbb{N}$ determines the multiplicity of the clauses. Now
$c((F, m)) := \sum_{C \in F} m(C)$, while the application of partial assignments $\varphi$ to a
multi-clause-set $F$ yields a multi-clause-set $\varphi * F$, where the multiplicity of a clause
$C$ in $\varphi * F$ is the sum of all multiplicities of clauses in $F$ which are shortened to $C$
by $\varphi$. For example, if $\varphi$ is a total assignment for $F$ (assigns all variables of
$F$) which does not satisfy $F$ (i.e., $\varphi * F \ne \top$), then $\varphi * F$ is
$(\{\bot\}, (f)_{C \in \{\bot\}})$, where $\bot := \emptyset$ is the empty clause, while
$f \in \mathbb{N}$ is the number of clauses (with their multiplicities) of $F$ falsified by
$\varphi$.

For the number of occurrences of a literal $x$ in a (multi-)clause-set $(F, m)$ we write
$\mathrm{ld}_F(x) := \sum_{C \in F, x \in C} m(C)$, called the literal-degree, while the
variable-degree of a variable $v$ is defined as
$\mathrm{vd}_F(v) := \mathrm{ld}_F(v) + \mathrm{ld}_F(\overline{v})$. A singular variable in a
(multi-)clause-set $F$ is a variable occurring in one sign only once (i.e.,
$1 \in \{\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v})\}$). A (multi-)clause-set is called
non-singular if it does not have singular variables.

For a set $V$ of variables and a multi-clause-set $F$ we denote by $F[V]$ the restriction of $F$
to $V$, which is obtained by removing clauses from $F$ which have no variables in common with
$V$, and removing from the remaining clauses all literals where the underlying variable is not in
$V$ (note that this can increase multiplicities of clauses).
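The notions of this subsection can be made concrete in a few lines of Python. The following is a minimal sketch under assumptions that are not part of the article: literals are encoded as non-zero integers (with `-x` the complement of `x`), clauses as frozensets, and a multi-clause-set as a `collections.Counter` mapping clauses to multiplicities. All function names are hypothetical.

```python
from collections import Counter

def variables(F):
    """var(F): the variables occurring in the multi-clause-set F."""
    return {abs(x) for C in F for x in C}

def deficiency(F):
    """delta(F) = c(F) - n(F), where clauses are counted with multiplicity."""
    return sum(F.values()) - len(variables(F))

def literal_degree(F, lit):
    """ld_F(lit): number of occurrences of the literal, with multiplicities."""
    return sum(m for C, m in F.items() if lit in C)

def variable_degree(F, v):
    """vd_F(v) = ld_F(v) + ld_F(complement of v)."""
    return literal_degree(F, v) + literal_degree(F, -v)

def min_var_degree(F):
    """mu-vd(F); assumes that F has at least one variable."""
    return min(variable_degree(F, v) for v in variables(F))

def restrict(F, V):
    """F[V]: drop clauses with no variable in V, restrict the others to V."""
    G = Counter()
    for C, m in F.items():
        D = frozenset(x for x in C if abs(x) in V)
        if D:
            G[D] += m  # equal restrictions merge, which raises multiplicities
    return G

# Example: the full clause-set A({1, 2}) with its four clauses.
F = Counter({frozenset(c): 1 for c in [(1, 2), (-1, 2), (1, -2), (-1, -2)]})
```

For this example the deficiency is 4 - 2 = 2, and `restrict(F, {1})` contains the unit clauses `{1}` and `{-1}` with multiplicity 2 each, which illustrates why multiplicities are needed.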
2.2 Autarkies

An autarky for a clause-set $F$ is a partial assignment $\varphi$ which satisfies every clause
$C \in F$ it touches, i.e., with $\mathrm{var}(\varphi) \cap \mathrm{var}(C) \ne \emptyset$. The
empty partial assignment is always an autarky for every $F$, the trivial autarky. If $\varphi$ is
an autarky for $F$, then $\varphi * F \subseteq F$ holds, and thus $\varphi * F$ is
satisfiability-equivalent to $F$. A clause-set $F$ is lean if there is no non-trivial autarky for
$F$. A weakening is the notion of a matching-lean clause-set $F$, which has no non-trivial
matching autarky; matching autarkies are special autarkies given by a matching condition (for
every clause touched, a unique variable underlying a satisfied literal must be selectable). The
process of applying autarkies as long as possible to a clause-set is confluent, yielding the lean
kernel of a clause-set. Computation of the lean kernel is NP-hard, but the matching-lean kernel,
obtained by applying matching autarkies as long as possible, which is also a confluent process,
is computable in polynomial time. Note that a clause-set $F$ is lean resp. matching lean iff the
lean resp. matching-lean kernel is $F$ itself. For every matching-lean multi-clause-set
$F \ne \top$ we have $\delta(F) \ge 1$, while in general a multi-clause-set $F \ne \top$ is
matching lean iff $\sigma(F) \ge 1$, where the surplus $\sigma(F) \in \mathbb{Z}$ is defined as
the minimum of $\delta(F[V])$ for all $\emptyset \ne V \subseteq \mathrm{var}(F)$. Note that while
w.r.t. general autarkies there is no difference between a multi-clause-set and the underlying
clause-set, for matching autarkies there is a difference, due to the matching condition. For more
information on autarkies see [?], [11].
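The definition of an autarky and the inclusion $\varphi * F \subseteq F$ can be checked mechanically. The sketch below is an illustration only, under the assumption (not from the article) that literals are non-zero integers, clauses are frozensets, multi-clause-sets are `Counter` objects, and partial assignments are dictionaries from variables to 0 or 1; the function names are hypothetical.

```python
from collections import Counter

def satisfies(phi, C):
    """True iff phi sets at least one literal of the clause C to 1."""
    return any(abs(x) in phi and (x > 0) == bool(phi[abs(x)]) for x in C)

def apply(phi, F):
    """phi * F: remove satisfied clauses, remove falsified literals elsewhere."""
    G = Counter()
    for C, m in F.items():
        if not satisfies(phi, C):
            G[frozenset(x for x in C if abs(x) not in phi)] += m
    return G

def is_autarky(phi, F):
    """True iff phi satisfies every clause of F that it touches."""
    return all(satisfies(phi, C)
               for C in F if any(abs(x) in phi for x in C))

F = Counter({frozenset({1, 2}): 1, frozenset({-2, 3}): 1})
```

Here `{1: 1}` is an autarky for `F` (it touches only `{1, 2}` and satisfies it), and applying it leaves the sub-clause-set consisting of `{-2, 3}`. The assignment `{2: 1}` is not an autarky, since it touches `{-2, 3}` without satisfying it.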



2.3 Minimally unsatisfiable clause-sets 

The set of minimally unsatisfiable clause-sets is $\mathcal{MU}$, the set of all clause-sets
which are unsatisfiable, while removal of any clause makes them satisfiable. Furthermore the set
of saturated minimally unsatisfiable clause-sets is $\mathcal{SMU} \subset \mathcal{MU}$, which
is the set of minimally unsatisfiable clause-sets such that addition of any literal to any clause
renders them satisfiable. We recall the fact that every minimally unsatisfiable clause-set
$F \in \mathcal{MU}$ can be saturated, i.e., by adding literal occurrences to $F$ we obtain
$F' \in \mathcal{SMU}$ with $\mathrm{var}(F') = \mathrm{var}(F)$ such that there is a bijection
$\alpha : F \to F'$ with $C \subseteq \alpha(C)$ for all $C \in F$. Some basic properties of
$\mathcal{MU}$ and $\mathcal{SMU}$ w.r.t. the application of partial assignments are given in the
following lemma.

Lemma 2.1 For all clause-sets $F$ we have:

1. $F \in \mathcal{SMU}$ iff for all $v \in \mathrm{var}(F)$ and $\varepsilon \in \{0, 1\}$ we
have $\langle v \to \varepsilon \rangle * F \in \mathcal{MU}$.

2. If for some variable $v$ holds $\langle v \to 0 \rangle * F \in \mathcal{SMU}$ and
$\langle v \to 1 \rangle * F \in \mathcal{SMU}$, then $F \in \mathcal{SMU}$.

3. If for some variable $v$ holds $\langle v \to 0 \rangle * F \in \mathcal{MU}$ and
$\langle v \to 1 \rangle * F \in \mathcal{MU}$, then $F \in \mathcal{MU}$.

For more information on minimal unsatisfiability see [?], [12].



3 Non-Mersenne numbers 



Splitting on variables with minimum occurrence in minimally unsatisfiable clause-sets leads by
Theorem 4.5 to the following recursion. The understanding of this recursion is the topic of this
section. On a first reading, only Definition 3.1 and the main results, Lemma 3.8 and Corollary
3.9, need to be considered.

Definition 3.1 For $k \in \mathbb{N}$ let $\mathrm{nM}(k) := 2$ if $k = 1$, while else

    $\mathrm{nM}(k) := \max_{i \in \{2, \dots, k\}} \min(2 \cdot i, \; \mathrm{nM}(k - i + 1) + i)$.



Remarks:

1. This is sequence http://oeis.org/A062289 in the "On-Line Encyclopedia of Integer Sequences".
It can be defined as the enumeration of those natural numbers whose binary representation
contains the string "10" (at consecutive positions). The sequence leaves out exactly the numbers
of the form $2^n - 1$ for $n \in \mathbb{N}$, and thus the name. The sequence consists of
arithmetic progressions of slope 1 and length $2^m - 1$, $m = 1, 2, \dots$, each such progression
separated by an additional step of +1. The recursion in Definition 3.1 is new, and so we cannot
use these characterisations, but must directly prove the basic properties.

2. The value of $\mathrm{nM}(k)$ for $k = (1), (2, 3, 4), (5, \dots, 11), (12, \dots, 26)$ is
$(2), (4, 5, 6), (8, \dots, 14), (16, \dots, 30)$.

3. For $k \ge 2$ we have $\mathrm{nM}(k) \ge 4$. This holds since $\mathrm{nM}(2) = 4$, while the
induction step for $k \ge 3$ is
$\mathrm{nM}(k) = \max_{i \in \{2, \dots, k\}} \min(2i, \mathrm{nM}(k - i + 1) + i) \ge \min(4, \min(4 + 2, 1 + 3)) = 4$.

4. By induction and by definition we have $k + 1 \le \mathrm{nM}(k) \le 2 \cdot k$ for
$k \in \mathbb{N}$.
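The recursion of Definition 3.1 and the remarks above can be checked numerically for small arguments. The following Python sketch evaluates the recursion bottom-up and compares it with a direct enumeration of the natural numbers not of the form $2^n - 1$; the function names are hypothetical, and the comparison is of course only a finite check, not a proof.

```python
from itertools import count, islice

def nM_values(K):
    """Return [nM(1), ..., nM(K)], computed directly from Definition 3.1."""
    nM = [None, 2]  # 1-indexed table, nM[1] = 2
    for k in range(2, K + 1):
        nM.append(max(min(2 * i, nM[k - i + 1] + i) for i in range(2, k + 1)))
    return nM[1:]

def non_mersenne():
    """Natural numbers not of the form 2^n - 1, in increasing order."""
    for x in count(1):
        if x & (x + 1):  # non-zero iff x + 1 is not a power of two
            yield x

values = nM_values(60)
steps = [b - a for a, b in zip(values, values[1:])]
```

For the first 60 arguments, `values` coincides with `list(islice(non_mersenne(), 60))`, and every entry of `steps` is 1 or 2, in line with Remark 1 and with the bounds of Remark 4.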

For a sequence $a : \mathbb{N} \to \mathbb{R}$ and $k \in \mathbb{N}$ let
$\Delta a(k) := a(k + 1) - a(k)$ be the step in the value of the sequence from $k$ to $k + 1$.
The next number in the sequence of non-Mersenne numbers is obtained by adding 1 or 2 to the
previous number:

Lemma 3.2 For $k \in \mathbb{N}$ holds $\Delta\mathrm{nM}(k) \in \{1, 2\}$.

Proof For $k = 1$ we get $\Delta\mathrm{nM}(1) = 2$. Now consider $k \ge 2$. We have

    $\mathrm{nM}(k + 1) = \max(\min(4, \mathrm{nM}(k) + 2), \; \max_{i \in \{3, \dots, k+1\}} \min(2i, \mathrm{nM}(k - i + 2) + i))$
    $= \max_{i \in \{3, \dots, k+1\}} \min(2i, \mathrm{nM}(k - i + 2) + i)$
    $= \max_{i \in \{2, \dots, k\}} \min(2(i + 1), \mathrm{nM}(k - (i + 1) + 2) + (i + 1))$
    $= \max_{i \in \{2, \dots, k\}} \min(2i + 2, \mathrm{nM}(k - i + 1) + i + 1)$
    $= 1 + \max_{i \in \{2, \dots, k\}} \min(2i + 1, \mathrm{nM}(k - i + 1) + i)$.

Thus on the one hand we have
$\mathrm{nM}(k + 1) \ge 1 + \max_{i \in \{2, \dots, k\}} \min(2i, \mathrm{nM}(k - i + 1) + i) = 1 + \mathrm{nM}(k)$,
and on the other hand
$\mathrm{nM}(k + 1) \le 1 + \max_{i \in \{2, \dots, k\}} \min(2i + 1, \mathrm{nM}(k - i + 1) + i + 1) = 2 + \mathrm{nM}(k)$. □



Corollary 3.3 $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ is strictly increasing.

Corollary 3.4 We have $\mathrm{nM}(a + b) \ge \mathrm{nM}(a) + b$ for $a \in \mathbb{N}$ and
$b \in \mathbb{N}_0$, and thus $\mathrm{nM}(a - b) \le \mathrm{nM}(a) - b$ for $b < a$.

Instead of considering the maximum over $k - 1$ cases $i \in \{2, \dots, k\}$ to compute
$\mathrm{nM}(k)$, we can now simplify the recursion to only one case $i(k) \in \{2, \dots, k\}$,
and for that case also consideration of the minimum is dispensable:

Lemma 3.5 For $k \in \mathbb{N}$, $k \ge 2$, let $i(k) \in \mathbb{N}$ be the smallest
$i \in \{2, \dots, k\}$ with $i \ge \mathrm{nM}(k - i + 1)$ (note that
$k \ge \mathrm{nM}(k - k + 1) = 2$, and thus $i(k)$ is well-defined). For example we have
$i(2) = 2$, $i(3) = 3$, $i(4) = 4$ and $i(5) = 4$. Then we have:

1. $i(k) - \mathrm{nM}(k - i(k) + 1) \le 2$.

2. $\mathrm{nM}(k) = \mathrm{nM}(k - i(k) + 1) + i(k)$.

3. $\Delta i(k) \in \{0, 1\}$.
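The three parts of Lemma 3.5 lend themselves to a quick numerical sanity check. The sketch below recomputes $\mathrm{nM}$ from the recursion of Definition 3.1, determines $i(k)$ by linear search, and tests the three assertions for all $k$ up to a chosen bound. The helper names are hypothetical, and a finite check like this only illustrates the lemma.

```python
def nM_lookup(K):
    """1-indexed table t with t[k] = nM(k) for 1 <= k <= K (Definition 3.1)."""
    t = [None, 2]
    for k in range(2, K + 1):
        t.append(max(min(2 * i, t[k - i + 1] + i) for i in range(2, k + 1)))
    return t

def i_of(k, t):
    """i(k): the smallest i in {2, ..., k} with i >= nM(k - i + 1)."""
    return next(i for i in range(2, k + 1) if i >= t[k - i + 1])

def check_lemma_3_5(K):
    """Check Parts 1-3 of Lemma 3.5 for all 2 <= k <= K; returns True on success."""
    t = nM_lookup(K + 1)
    for k in range(2, K + 1):
        i, i_next = i_of(k, t), i_of(k + 1, t)
        if i - t[k - i + 1] > 2:          # Part 1
            return False
        if t[k] != t[k - i + 1] + i:      # Part 2
            return False
        if i_next - i not in (0, 1):      # Part 3
            return False
    return True
```

Part 2 is the practically useful one: once $i(k)$ is known, $\mathrm{nM}(k)$ is obtained from a single earlier value instead of a maximum over $k - 1$ cases.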
Proof We have $i(k) = 2$ iff $k = 2$, while for $k = 2$ the assertions hold trivially; so assume
$k \ge 3$ and $i(k) \ge 3$. Part 1 follows by Lemma 3.2 from the facts that the sequence
$i \in \{2, \dots, k\} \mapsto i$ moves up in steps of +1, while the sequence
$i \in \{2, \dots, k\} \mapsto \mathrm{nM}(k - i + 1)$ moves down in steps of -1 or -2. It
remains to show Part 2. By Lemma 3.2 the sequence
$i \in \{2, \dots, k\} \mapsto \mathrm{nM}(k - i + 1) + i$ is monotonically decreasing, and thus
by definition we obtain
$\mathrm{nM}(k) = \max(2 \cdot (i(k) - 1), \; \mathrm{nM}(k - i(k) + 1) + i(k))$. That the maximum
here is actually always attained in the second component follows by Part 1. Finally Part 3
follows again from Lemma 3.2. □

After these preparations we are able to characterise the "jump positions", the set
$J \subseteq \mathbb{N}$ of $k \in \mathbb{N}$ with $\Delta\mathrm{nM}(k) = 2$. Thus
$\Delta\mathrm{nM}(k) = 1$ iff $k \notin J$, and $J = \{1, 4, 11, 26, \dots\}$. Note
$\mathrm{nM}(k) = 1 + k + |\{k' \in J : k' < k\}|$.

Lemma 3.6 Let $i'(k) := k - i(k) + 1$ and $h(k) := \mathrm{nM}(i'(k))$ for $k \in \mathbb{N}$,
$k \ge 2$. Thus $\Delta i'(k) \in \{0, 1\}$ and $\Delta i(k) = 1 - \Delta i'(k)$. Furthermore we
have $\mathrm{nM}(k) = h(k) + i(k)$, thus $\Delta\mathrm{nM}(k) = \Delta h(k) + \Delta i(k)$, and
$i(k) - h(k) \in \{0, 1, 2\}$. Consider $k \ge 2$.

1. If $\Delta i(k) = 0$, then:

   (a) $\Delta i(k + 1) = 1$.

   (b) $i(k) \ne h(k)$.

   (c) $i(k + 1) = h(k + 1)$.

2. If $\Delta i(k) = 1$, then:

   (a) $\Delta h(k) = 0$, and so $k \notin J$.

   (b) $i(k) \ne h(k) + 2$.

3. The following conditions are equivalent:

   (a) $k \in J$

   (b) $\Delta h(k) = 2$

   (c) $i(k) = h(k) + 2$

   (d) $\Delta i(k - 1) = 1$ and $i(k - 1) = h(k - 1) + 1$

   (e) $\Delta i(k - 2) = \Delta i(k - 1) = 1$

   (f) $i'(k) = i'(k - 1) = i'(k - 2)$ and $i'(k) \in J$.

4. If $k \in J$, then $i'(k) = \max(k' \in J : k' < k)$.



Proof Part 2a follows by definition. For Part 1b note $i(k + 1) = i(k)$ while
$h(k + 1) \ge h(k) + 1$. For Part 1c assume $i(k + 1) > h(k + 1)$. Then we have
$i(k) = h(k) + 2$ and $h(k + 1) = h(k) + 1$. However then
$i(k) - 1 = h(k) + 1 = h(k + 1) = \mathrm{nM}(k - (i(k) - 1) + 1)$, contradicting the definition
of $i(k)$. For Part 1a assume $i(k) = i(k + 1) = i(k + 2)$. We have
$i(k) \ge h(k + 2) = \mathrm{nM}(k - i(k) + 3)$, while
$i(k) - 1 < \mathrm{nM}(k - (i(k) - 1) + 1) = \mathrm{nM}(k - i(k) + 2)$, i.e.,
$i(k) \le \mathrm{nM}(k - i(k) + 2)$, contradicting the strict monotonicity of $\mathrm{nM}$.
Part 2b follows by $i(k + 1) \le h(k + 1) + 2$ and $i(k + 1) = i(k) + 1$, $h(k + 1) = h(k)$. Now
consider Part 3.

Condition 3a implies condition 3b due to $\Delta i(k) = 0$ in case of $k \in J$ by Part 2a.
Condition 3b implies condition 3c, since $\Delta h(k) = 2$ implies $\Delta i(k) = 0$ (otherwise
we had $\Delta\mathrm{nM}(k) = 3$), and so by Part 1c we have $i(k) = i(k + 1) = h(k + 1)$, while
the assumption says $h(k + 1) = h(k) + 2$. In turn condition 3c implies condition 3a, since by
Part 2b we get $\Delta i(k) = 0$, and thus $\Delta\mathrm{nM}(k) = \Delta h(k)$, while in case of
$\Delta h(k) \le 1$ we would have $i(k) - 1 \ge \mathrm{nM}(k - (i(k) - 1) + 1)$, contradicting
the definition of $i(k)$, due to
$\mathrm{nM}(k - (i(k) - 1) + 1) = \mathrm{nM}((k + 1) - i(k + 1) + 1) = h(k + 1) \le h(k) + 1 = i(k) - 1$.
So now we can freely use the equivalence of these three conditions.

Condition 3c implies condition 3d, since we have $\Delta i(k) = 0$, and thus
$\Delta i(k - 1) = 1$ with Part 1a, from which we furthermore get $i(k) = i(k - 1) + 1$ and
$h(k - 1) = h(k)$, and so $i(k - 1) = i(k) - 1 = h(k) + 1 = h(k - 1) + 1$. Condition 3d implies
condition 3e, since in case of $\Delta i(k - 2) = 0$ we had $i(k - 1) = h(k - 1)$ with Part 1c.
In turn condition 3e implies condition 3c, since $i(k) = i(k - 1) + 1 = i(k - 2) + 2$, while
$h(k) = h(k - 1) = h(k - 2)$, where by definition $i(k - 2) \ge h(k - 2)$ holds, whence
$i(k) \ge h(k) + 2$, which implies $i(k) = h(k) + 2$. So now the first five conditions have been
shown to be equivalent.

Now condition 3e implies condition 3f, since it only remains to show $i'(k) \in J$, which follows
with condition 3b (using $\Delta i(k) = 0$). In turn condition 3f implies immediately condition
3e.

Finally, we prove Part 4 by induction on $k$ (regarding the enumeration of $J$). We have
$i'(4) = 1$, and so the induction holds for $k = 4$, the smallest jump position $k \ge 2$. Now
assume that the assertion holds for all elements of $J \cap \{1, \dots, k - 1\}$, where $k > 4$,
and we have to show the assertion for $k$. By Part 3 we know $i'(k) \in J$, where
$2 \le i'(k) < k$. Assume there is $k' \in J$ with $i'(k) < k' < k$. Now by induction hypothesis
we get $i'(k) \le i'(k') < k'$. However by Part 2a (applied to $k' \in J$) we get
$\Delta i'(k') = 1$, and thus $i'(k) > i'(k')$ (since $k > k'$). □

Corollary 3.7 We have $J = \{2^{m+1} - m - 2 : m \in \mathbb{N}\}$.

Proof Let $k_m$ for $m \in \mathbb{N}$ be the $m$-th element of $J$; so the assertion is
$k_m = 2^{m+1} - m - 2$. We have $k_1 = 4 - 1 - 2 = 1 = \min J$; in the remainder assume
$m \ge 2$. We prove the assertion by induction, in parallel with $i(k_m) = 2^{m+1} - 2^m$. For
$m = 2$ we have $k_2 = 8 - 2 - 2 = 4 = \min J \setminus \{1\}$, while $i(4)$ is the smallest
$i \in \{2, 3, 4\}$ with $i \ge \mathrm{nM}(5 - i)$, which yields $i(4) = 4 = 2^3 - 2^2$. Now we
consider the induction step, from $m - 1$ to $m$. The induction hypothesis yields
$k_{m-1} = 2^m - m - 1$ and $i(k_{m-1}) = 2^m - 2^{m-1}$. Lemma 3.6, Part 4 yields
$i'(k_m) = k_{m-1}$, from which by $i'(k_m) = k_m - i(k_m) + 1$ follows
$k_m = 2^m - m - 2 + i(k_m)$. By definition we get
$i(k_m) = \Delta i(k_m - 1) + \dots + \Delta i(k_{m-1}) + i(k_{m-1})$. By Lemma 3.6, Parts 1 to 3,
the sequence of $\Delta$-values has the form (starting with the lowest index)
$0, 1, 0, 1, \dots, 0, 1, 1$, and thus their sum has the value
$\frac{1}{2}(k_m - k_{m-1} - 1) + 1$. So we get

    $i(k_m) = \frac{1}{2}(k_m - k_{m-1} - 1) + 1 + i(k_{m-1})$
    $= \frac{1}{2}(2^m - m - 2 + i(k_m) - 2^m + m + 1 - 1) + 1 + 2^m - 2^{m-1}$
    $= \frac{1}{2} i(k_m) - 1 + 1 + 2^m - 2^{m-1}$,

from which $i(k_m) = 2^{m+1} - 2^m$ follows. Finally
$k_m = 2^m - m - 2 + 2^{m+1} - 2^m = 2^{m+1} - m - 2$. □
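Corollary 3.7 can be compared against the recursion directly. The following sketch reads the jump positions off a table of $\mathrm{nM}$-values computed from Definition 3.1 and lists the values $2^{m+1} - m - 2$ up to the same bound; the function names are hypothetical, and the comparison covers only the finite range that is computed.

```python
def nM_lookup(K):
    """1-indexed table t with t[k] = nM(k) for 1 <= k <= K (Definition 3.1)."""
    t = [None, 2]
    for k in range(2, K + 1):
        t.append(max(min(2 * i, t[k - i + 1] + i) for i in range(2, k + 1)))
    return t

def jump_positions(K):
    """All k in {1, ..., K} with nM(k + 1) - nM(k) = 2."""
    t = nM_lookup(K + 1)
    return [k for k in range(1, K + 1) if t[k + 1] - t[k] == 2]

def closed_form_jumps(K):
    """All numbers 2^(m+1) - m - 2 <= K for m = 1, 2, ..."""
    out, m = [], 1
    while 2 ** (m + 1) - m - 2 <= K:
        out.append(2 ** (m + 1) - m - 2)
        m += 1
    return out
```

Up to the bound 130 both functions return the list 1, 4, 11, 26, 57, 120, and the identity $\mathrm{nM}(k) = 1 + k + |\{k' \in J : k' < k\}|$ can be confirmed on the same table.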

Now the closed formula for $\mathrm{nM}(k)$ can be proven (using $\mathrm{ld}(x) := \log_2(x)$):

Lemma 3.8 For $k \in \mathbb{N}$ let $\mathrm{fld}(k) := \lfloor \mathrm{ld}(k) \rfloor$ ("floor
of logarithm dualis"). Then we have for $k \in \mathbb{N}$ the equality
$\mathrm{nM}(k) = k + \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1))$.

Proof Let $g(k) := \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1))$ and $f(k) := k + g(k)$ (so
$\mathrm{nM}(k) = f(k)$ is to be shown, for $k \ge 1$). We have
$f(1) = 1 + \mathrm{fld}(2 + \mathrm{fld}(2)) = 1 + \mathrm{fld}(3) = 2 = \mathrm{nM}(1)$. We will
now prove that the function $g(k)$ changes values exactly at the transitions $k \mapsto k + 1$
for $k \in J$, that is, for indices $k = k_m := 2^{m+1} - m - 2$ (using Corollary 3.7) with
$m \in \mathbb{N}$ we have $\Delta g(k_m) = 1$, while otherwise we have $\Delta g(k) = 0$, from
which the assertion follows (by the definition of $J$).

We have $g(1) = 1$ and $g(2) = 2$. Now consider $m \in \mathbb{N}$ and
$k_m + 1 \le k \le k_{m+1}$. We show $g(k) = m + 1$, which proves the claim. Note that $g(k)$ is
monotonically increasing. Now

    $g(k) \ge g(k_m + 1) = \lfloor \mathrm{ld}(2^{m+1} - m + \lfloor \mathrm{ld}(2^{m+1} - m) \rfloor) \rfloor = \lfloor \mathrm{ld}(2^{m+1} - m + m) \rfloor = m + 1$

and

    $g(k) \le g(k_{m+1}) = \lfloor \mathrm{ld}(2^{m+2} - m - 2 + \lfloor \mathrm{ld}(2^{m+2} - m - 2) \rfloor) \rfloor \le \lfloor \mathrm{ld}(2^{m+2} - m - 2 + m + 1) \rfloor = \lfloor \mathrm{ld}(2^{m+2} - 1) \rfloor = m + 1$. □

As a result, we obtain very precise bounds:

Corollary 3.9 $k + \mathrm{fld}(k + 1) \le \mathrm{nM}(k) \le k + 1 + \mathrm{fld}(k)$ holds for
$k \in \mathbb{N}$.

Proof The lower bound follows trivially. The upper bound holds (with equality) for $k \le 2$, so
assume $k \ge 3$. We have to show
$g(k) = \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1)) \le 1 + \mathrm{fld}(k)$, which follows from
$\mathrm{ld}(k + 1 + \mathrm{fld}(k + 1)) \le 1 + \mathrm{ld}(k)$. Now
$\mathrm{ld}(k + 1 + \mathrm{fld}(k + 1)) \le \mathrm{ld}(k + 1 + \mathrm{ld}(k + 1)) \le \mathrm{ld}(k + k) = 1 + \mathrm{ld}(k)$. □
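The closed formula of Lemma 3.8 and the bounds of Corollary 3.9 are easy to evaluate with exact integer arithmetic. In the sketch below, $\mathrm{fld}$ is computed via the bit length of an integer, which avoids floating-point logarithms; the function names are hypothetical.

```python
def fld(k):
    """floor(log2(k)) for an integer k >= 1, computed exactly."""
    return k.bit_length() - 1

def nM_closed(k):
    """nM(k) = k + fld(k + 1 + fld(k + 1)), the closed formula of Lemma 3.8."""
    return k + fld(k + 1 + fld(k + 1))

def bounds_hold(k):
    """Check k + fld(k + 1) <= nM(k) <= k + 1 + fld(k) (Corollary 3.9)."""
    return k + fld(k + 1) <= nM_closed(k) <= k + 1 + fld(k)
```

For example, `nM_closed(5)` evaluates to 5 + fld(6 + 2) = 8, and `nM_closed(26)` to 26 + fld(27 + 4) = 30, matching the values listed in the remarks after Definition 3.1.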



4 Lean clause-sets and the surplus 



In this section we prove the main result of this paper, Theorem 4.1. The proof consists in first
handling a special case, minimally unsatisfiable clause-sets instead of lean clause-sets, in
Subsection 4.1, and then lifting the result to the general case in Subsection 4.2. In Subsection
4.3 we consider the algorithmic implications of this result.



Theorem 4.1 We have $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$ for a lean multi-clause-set $F$
with $n(F) > 0$. More precisely, there exists a variable $v \in \mathrm{var}(F)$ with
$\mathrm{vd}_F(v) \le \mathrm{nM}(\sigma(F))$ and
$\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v}) \le \sigma(F)$.

We obtain a sufficient criterion for the existence of a non-trivial autarky.

Corollary 4.2 Consider a multi-clause-set $F$ with $n(F) > 0$. If $\sigma(F) \le 0$, then $F$ has
a non-trivial matching autarky. So assume $\sigma(F) \ge 1$. If we have
$\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$, then for every
$\emptyset \ne V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$ we have an autarky
$\varphi$ for $F$ with $\mathrm{var}(\varphi) = V$ (and thus $F$ has a non-trivial autarky).

The quantities $\mu\mathrm{vd}(F)$ and $\mathrm{nM}(\sigma(F))$ (resp. $\mathrm{nM}(\delta(F))$)
are computable in polynomial time, and so the applicability of Corollary 4.2 can be checked in
polynomial time. We conjecture that also "constructivisation" of Corollary 4.2 can be done in
polynomial time:



Conjecture 4.3 There is a poly-time algorithm for computing a non-trivial autarky in case of
$\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$ (or $\mu\mathrm{vd}(F) > \mathrm{nM}(\delta(F))$)
for matching-lean clause-sets $F$.

See Subsection 4.3 for more discussion on Conjecture 4.3 (there also the remaining details of
Corollary 4.2 are proven).
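The criterion of Corollary 4.2 can be tried out on small instances. The sketch below is illustrative only and rests on assumptions that are not part of the article: literals are non-zero integers, multi-clause-sets are `Counter` objects over frozensets, and the surplus is computed by brute force over all non-empty variable sets, which takes exponential time (the polynomial-time computation mentioned above is not implemented here). The value $\mathrm{nM}(k)$ is taken from the closed formula of Lemma 3.8.

```python
from collections import Counter
from itertools import combinations

def nM(k):
    """Closed formula of Lemma 3.8, with floor(log2(x)) = x.bit_length() - 1."""
    inner = k + 1 + ((k + 1).bit_length() - 1)
    return k + inner.bit_length() - 1

def variables(F):
    return {abs(x) for C in F for x in C}

def deficiency(F):
    return sum(F.values()) - len(variables(F))

def restrict(F, V):
    """F[V] as a multi-clause-set."""
    G = Counter()
    for C, m in F.items():
        D = frozenset(x for x in C if abs(x) in V)
        if D:
            G[D] += m
    return G

def surplus(F):
    """sigma(F) by brute force over all non-empty V (exponential time)."""
    vs = sorted(variables(F))
    return min(deficiency(restrict(F, set(V)))
               for r in range(1, len(vs) + 1) for V in combinations(vs, r))

def min_var_degree(F):
    return min(sum(m for C, m in F.items() if v in C or -v in C)
               for v in variables(F))

def criterion_applies(F):
    """True iff Corollary 4.2 guarantees a non-trivial autarky for F."""
    s = surplus(F)
    return s <= 0 or min_var_degree(F) > nM(s)

# Four of the eight full clauses over the variables 1, 2, 3.
F = Counter({frozenset(c): 1
             for c in [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3)]})
```

For this `F` the surplus is 1 (attained only for all three variables), every variable has degree 4, and 4 > nM(1) = 2, so the criterion applies; indeed `F` is satisfiable, and a satisfying assignment is an autarky on all variables. For the full clause-set over two variables the surplus is 2 and the minimal degree is 4 = nM(2), so the criterion does not apply.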



4.1 The special case of minimally unsatisfiable clause-sets 

The main auxiliary lemma is the following statement, which receives its importance from the fact
that every minimally unsatisfiable clause-set can be saturated (this method was first applied in
[?]).

Lemma 4.4 Consider $F \in \mathcal{SMU}_{\delta=k}$ for $k \in \mathbb{N}$ and a variable
$v \in \mathrm{var}(F)$ realising the minimal var-degree (i.e.,
$\mathrm{vd}_F(v) = \mu\mathrm{vd}(F)$). Using $m_0 := \mathrm{ld}_F(\overline{v})$ and
$m_1 := \mathrm{ld}_F(v)$ we have
$\langle v \to \varepsilon \rangle * F \in \mathcal{MU}_{\delta = k - m_\varepsilon + 1}$ for
$\varepsilon \in \{0, 1\}$, where $n(\langle v \to \varepsilon \rangle * F) = n(F) - 1$. Since
minimally unsatisfiable clause-sets have deficiency at least one, we get $m_\varepsilon \le k$.

Proof We have $n(\langle v \to \varepsilon \rangle * F) = n(F) - 1$ since $F$ contains no pure
variable, while $v$ realises the minimum of var-degrees. Thus
$\delta(\langle v \to \varepsilon \rangle * F) = \delta(F) - m_\varepsilon + 1$, while
$\langle v \to \varepsilon \rangle * F \in \mathcal{MU}$ by Lemma 2.1, Part 1. □
Theorem 4.5 For all $k \in \mathbb{N}$ and $F \in \mathcal{MU}_{\delta \le k}$ we have
$\mu\mathrm{vd}(F) \le \mathrm{nM}(k)$. More precisely, for $n(F) > 0$ there exists a variable
$v \in \mathrm{var}(F)$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(k)$ and
$\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v}) \le k$.

Proof The assertion is known for $k = 1$, so assume $k > 1$, and we apply induction on $k$.
Assume $\delta(F) = k$ (due to $k > 1$ we have $n(F) > 1$). Saturate $F$ and obtain $F'$.
Consider a variable $v \in \mathrm{var}(F')$ realising the min-var-degree of $F'$. If
$\mathrm{vd}_{F'}(v) = 2$ then we are done, so assume $\mathrm{vd}_{F'}(v) \ge 3$. Let
$i := \max(\mathrm{ld}_{F'}(v), \mathrm{ld}_{F'}(\overline{v}))$; so
$\mathrm{vd}_{F'}(v) \le 2i$. W.l.o.g. assume that $i = \mathrm{ld}_{F'}(v)$. By Lemma 4.4 we get
$2 \le i \le k$. Applying the induction hypothesis and Lemma 4.4 we obtain a variable
$w \in \mathrm{var}(G)$ for $G := \langle v \to 1 \rangle * F'$ with
$\mathrm{vd}_G(w) \le \mathrm{nM}(k - i + 1)$. By definition we have
$\mathrm{vd}_{F'}(w) \le \mathrm{vd}_G(w) + \mathrm{ld}_{F'}(v)$. Altogether we get
$\mu\mathrm{vd}(F) \le \min(2i, \mathrm{nM}(k - i + 1) + i) \le \mathrm{nM}(k)$. □
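A simple family on which the bound of Theorem 4.5 can be observed to be attained is given by the full clause-sets $A(V)$: they are unsatisfiable (Section 2.1), removing any clause makes them satisfiable, every variable occurs in all $2^{|V|}$ clauses, and the deficiency is $2^{|V|} - |V|$. The sketch below checks for small $|V|$ that the minimal variable degree equals $\mathrm{nM}$ of the deficiency. It uses the closed formula of Lemma 3.8, encodes literals as non-zero integers (an assumption of this sketch), and the function names are hypothetical.

```python
from itertools import product

def nM(k):
    """Closed formula of Lemma 3.8, with floor(log2(x)) = x.bit_length() - 1."""
    inner = k + 1 + ((k + 1).bit_length() - 1)
    return k + inner.bit_length() - 1

def full_clause_set(n):
    """A(V) for V = {1, ..., n}: all 2^n full clauses over V."""
    return [frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
            for signs in product((1, -1), repeat=n)]

def min_var_degree(F):
    """Minimal number of clauses of F containing a given variable."""
    vs = {abs(x) for C in F for x in C}
    return min(sum(1 for C in F if v in C or -v in C) for v in vs)

def tight_for_full_clause_set(n):
    """True iff mu-vd(A(V)) = nM(delta(A(V))) for |V| = n."""
    F = full_clause_set(n)
    delta = len(F) - n
    return min_var_degree(F) == nM(delta) == 2 ** n
```

For $|V| = 2$ this reproduces the example with deficiency 2 and minimal variable degree 4 = nM(2); the same equality is observed here for $|V|$ up to 8.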

It is interesting to generalise Theorem 4.5 for generalised clause-sets (see [11], [12] for a
systematic study, and [10] for the underlying report). Generalised clause-sets have literals
"$v \ne \varepsilon$" for variables $v$ with domains $D_v$ and values $\varepsilon \in D_v$, and
the deficiency is generalised by giving every variable a weight $|D_v| - 1$ (which is 1 in the
boolean case). The base case of deficiency $k = 1$ is handled in Lemma 5.4 in [12], showing that
for generalised clause-sets we have here
$\mu\mathrm{vd}(F) \le \max_{v \in \mathrm{var}(F)} |D_v|$. But $k \ge 2$ requires more work:

1. The basic method of saturation is not available for generalised clause-sets, as discussed in
Subsection 5.1 in [?]. Thus the proofs for the boolean case seem not to be generalisable.

2. Stipulating the effects of saturation via the "substitution stability parameter regarding
irredundancy", in Corollary 5.10 in [12] one finds a first approach towards generalising the
basic bound $\mu\mathrm{vd}(F) \le 2\delta(F)$ (for the boolean case) by
$\mu\mathrm{vd}(F) \le \max_{v \in \mathrm{var}(F)} |D_v| \cdot \delta(F)$.

3. Another approach uses translations to boolean clause-sets. The "generic translation scheme"
(see [?]) allows (for certain instances) to preserve the deficiency and the other structures
relevant here. So we get general upper bounds for the minimum number of occurrences of variables
in generalised clause-sets from the boolean case. But further investigations into these bounds
are needed.

4.2 Proof of the general case 

Now consider an arbitrary (multi-)clause-set $F$. Consider a set of variables
$\emptyset \ne V \subseteq \mathrm{var}(F)$ realising the surplus of $F$, i.e., such that
$\delta(F[V])$ is minimal. If $F[V]$ were satisfiable, then a satisfying assignment would give a
non-trivial autarky for $F$. Assuming that $F$ is lean thus yields that $F[V]$ must be
unsatisfiable. So there exists a minimally unsatisfiable $F' \subseteq F[V]$. If now
$\mathrm{var}(F') \ne \mathrm{var}(F[V]) = V$ were the case, then we would lose control over the
deficiency of $F'$. Fortunately this cannot happen, as the following lemma shows.

Lemma 4.6 Consider a multi-clause-set $F$ with $\sigma(F) = \delta(F)$. Then for every
unsatisfiable sub-multi-clause-set $F' \le F$ we have $\mathrm{var}(F') = \mathrm{var}(F)$.

Proof Assume $\mathrm{var}(F') \subset \mathrm{var}(F)$, and consider a minimally unsatisfiable
sub-clause-set $F'' \subseteq F'$. By definition we have
$\delta(F'') + \delta(F[\mathrm{var}(F) \setminus \mathrm{var}(F'')]) \le \delta(F)$, where
$\delta(F[\mathrm{var}(F) \setminus \mathrm{var}(F'')]) \ge \sigma(F) = \delta(F)$, from which we
conclude $\delta(F'') \le 0$, but $\delta(F'') \ge 1$ must hold since $F''$ is minimally
unsatisfiable. □



Finally we are able to prove Theorem 4.1. Recall that $F$ is a lean multi-clause-set with
$n(F) > 0$, and we have to show the existence of a variable $v$ with
$\mathrm{vd}_F(v) \le \mathrm{nM}(\sigma(F))$ and
$\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v}) \le \sigma(F)$.

Consider $\emptyset \ne V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$, and let
$F' := F[V]$. $F'$ is unsatisfiable, since $F$ is lean. Because of $\delta(F') = \sigma(F)$ we
have $\delta(F') = \sigma(F')$. Consider some minimally unsatisfiable $F'' \subseteq F'$. By
Lemma 4.6 we have $\mathrm{var}(F'') = \mathrm{var}(F')$. So we get
$\delta(F'') = \delta(F') - (c(F') - c(F''))$. By Theorem 4.5 there is
$v \in \mathrm{var}(F'')$ with

    $\mathrm{vd}_{F''}(v) \le \mathrm{nM}(\delta(F'')) = \mathrm{nM}(\delta(F') - (c(F') - c(F''))) \le \mathrm{nM}(\delta(F')) - (c(F') - c(F''))$

and
$\mathrm{ld}_{F''}(v), \mathrm{ld}_{F''}(\overline{v}) \le \delta(F'') = \delta(F') - (c(F') - c(F''))$.
Finally we have $\mathrm{vd}_F(v) \le \mathrm{vd}_{F''}(v) + (c(F') - c(F''))$ (note that all
occurrences of $v$ in $F$ are also in $F'$), and similarly for the literal degrees. QED

Corollary 4.7 For a lean multi-clause-set $F$ with $n(F) > 0$ we have
$\mu\mathrm{vd}(F) \le \mathrm{nM}(\delta(F))$.

Corollary 4.8 Consider a lean multi-clause-set $F$.

1. $\sigma(F) = 1$ holds if and only if $\mu\mathrm{vd}(F) = 2$ holds.

2. $\mu\mathrm{vd}(F) = 3$ implies $\sigma(F) = 2$.

Proof First consider Part 1. If $\sigma(F) = 1$ (so $n(F) > 0$), then by Theorem 4.1 we have
$\mu\mathrm{vd}(F) \le \mathrm{nM}(1) = 2$, while in case of $\mu\mathrm{vd}(F) = 1$ there would
be a matching autarky for $F$. If on the other hand $\mu\mathrm{vd}(F) = 2$ holds, then by
definition $\sigma(F) \le 2 - 1 = 1$, while $\sigma(F) \ge 1$ holds since $F$ is matching lean.
For Part 2 note that due to $\sigma(F) + 1 \le \mu\mathrm{vd}(F)$ we have $\sigma(F) \le 2$, and
then the assertion follows by Part 1. □

Remarks:

1. If $F$ is lean, then $\sigma(F) = 2$ implies $\mu\mathrm{vd}(F) \in \{3, 4\}$. An example for
$\mu\mathrm{vd}(F) = 4$ is given by the full unsatisfiable clause-set with 2 variables.

2. Is there a minimally unsatisfiable $F$ with $\mu\mathrm{vd}(F) = 4$ and $\sigma(F) = 3$?

3. More generally, is there for every $k \in \mathbb{N}$ a minimally unsatisfiable $F$ with
$\sigma(F) = k$ and $\mu\mathrm{vd}(F) = k + 1$?

4.3 On finding the autarky 



The following lemma (with Theorem 4.1) yields the proof of Corollary 4.2.

Lemma 4.9 Consider a matching-lean multi-clause-set $F$ with $n(F) > 0$. If we have
$\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$, then all $F[V]$ for
$\emptyset \ne V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$ are satisfiable.

Proof If some $F[V]$ were unsatisfiable, then by the proof of Theorem 4.1 in Subsection 4.2
there would be a variable $v$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(\sigma(F))$. □

Now consider a matching-lean multi-clause-set $F$ with $n(F) > 0$, where Corollary 4.2 is
applicable (recall that we have $\sigma(F) \ge 1$), that is, we have
$\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$. So we know that $F$ has a non-trivial autarky.
Conjecture 4.3 states that finding such a non-trivial autarky in this case can be done in
polynomial time (recall that finding a non-trivial autarky in general is NP-complete, which was
shown in [?]).

The task of actually finding the autarky can be considered as finding a satisfying assignment
for the following class $\mathcal{MLCR} \subset \mathcal{SAT} \cap \mathcal{MLEAN}$ of
satisfiable(!) clause-sets $F$, obtained by considering all $F[V]$ for minimal sets of variables
$V$ with $\delta(F[V]) = \sigma(F)$ (where "CR" stands for "critical"):



Definition 4.10 Let be the class of clause-sets F fulfilling the following 

three conditions: 



1. F is matching-lean, has at least one variable, and does not contain the empty
clause.

2. The only $\emptyset \ne V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$ is V = var(F) (and thus we
have $\delta(F) = \sigma(F)$).

3. $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$.

It is sufficient to find a non-trivial autarky for this class of satisfiable clause-sets. 
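
The three conditions of Definition 4.10 can be tested by brute force on small inputs. The Python sketch below is only an illustration under stated assumptions: clauses are frozensets of non-zero integers, $\delta(F[V])$ is computed as the number of clauses containing a variable of V minus |V|, matching leanness is tested via the characterisation that every strict sub-clause-set has strictly smaller deficiency, and nM(k) is computed by direct enumeration of the natural numbers not of the form $2^n - 1$. The function names are hypothetical.

```python
from itertools import combinations

def variables(F):
    return {abs(x) for C in F for x in C}

def deficiency(F):
    return len(F) - len(variables(F))

def restricted_deficiency(F, V):
    # Assumed delta(F[V]): clauses touching V, minus |V|.
    return sum(1 for C in F if any(abs(x) in V for x in C)) - len(V)

def is_matching_lean(F):
    # Assumed characterisation: delta(F') < delta(F) for all strict F' of F.
    F = list(F)
    d = deficiency(F)
    return all(deficiency(sub) < d
               for r in range(len(F))
               for sub in combinations(F, r))

def nM(k):
    # k-th natural number that is not of the form 2^n - 1.
    x = count = 0
    while count < k:
        x += 1
        if x & (x + 1):  # non-zero iff x + 1 is not a power of two
            count += 1
    return x

def in_MLCR(F):
    F = list(F)
    vs = sorted(variables(F))
    # Condition 1
    if not vs or frozenset() in F or not is_matching_lean(F):
        return False
    # Condition 2: var(F) is the only minimiser of delta(F[V])
    deltas = {V: restricted_deficiency(F, set(V))
              for r in range(1, len(vs) + 1)
              for V in combinations(vs, r)}
    sigma = min(deltas.values())
    if [V for V, d in deltas.items() if d == sigma] != [tuple(vs)]:
        return False
    # Condition 3
    mu_vd = min(sum(1 for C in F if v in C or -v in C) for v in vs)
    return mu_vd > nM(sigma)

M3 = [frozenset(c) for c in ({1, 2, 3}, {-1, 2, 3}, {1, -2, 3}, {1, 2, -3})]
A2 = [frozenset(c) for c in ({1, 2}, {1, -2}, {-1, 2}, {-1, -2})]
print(in_MLCR(M3), in_MLCR(A2))  # True False
```

Under these assumptions the check accepts the four clauses over three variables with at most one complementation (deficiency 1), and rejects the full clause-set over two variables, where Condition 3 fails.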



Lemma 4.11 Conjecture 4.3 is equivalent to the statement that finding a non-trivial
autarky for clause-sets in $\mathcal{MLCR}$ can be achieved in polynomial time.

At the time of writing this article, we are not aware of elements of $\mathcal{MLCR}$ with
deficiency at least 2.



5 On strengthening the bound 

For a class $\mathcal{C}$ of clause-sets let $\mu\mathrm{vd}(\mathcal{C})$ be the supremum of $\mu\mathrm{vd}(F)$ for $F \in \mathcal{C}$
with n(F) > 0. So by Theorem 4.5 we have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}(k)$ for all $k \in \mathbb{N}$.
The task of precisely determining $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$ for all k will be pursued in
the forthcoming [13]; we need more theory for minimally unsatisfiable clause-sets
(especially for unsatisfiable hitting clause-sets), and so here we can only mention
some results connected with this article.



• We can show for infinitely many k that $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) = \mathrm{nM}(k)$.

• We can also show that the smallest k where we don't have equality is k = 6,
namely $\mu\mathrm{vd}(\mathcal{MU}_{\delta=6}) = 8 = \mathrm{nM}(6) - 1$.

• Let $\mathrm{nM}_1 : \mathbb{N} \to \mathbb{N}$ be defined by the recursion as in Definition 3.1, however
with different start values, namely $\mathrm{nM}_1(k) := \mathrm{nM}(k)$ for $1 \le k \le 5$, while
$\mathrm{nM}_1(6) := \mathrm{nM}(6) - 1 = 8$. We have $\mathrm{nM}_1(k) = \mathrm{nM}(k)$ for $k \notin \{2^m - m + 1 : m \in \mathbb{N}, m \ge 3\}$,
while for $k = 2^m - m + 1$ we have $\mathrm{nM}_1(k) = \mathrm{nM}(k) - 1 = 2^m$.

• With the same proof as for Theorem 4.5 we can show $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}_1(k)$
for all $k \in \mathbb{N}$.

• It seems that this bound cannot be generalised to lean clause-sets (as in
Theorem 4.1).
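
The values of nM and $\mathrm{nM}_1$ mentioned in the list above can be tabulated directly. The following Python sketch computes nM(k) by enumerating the natural numbers that are not of the form $2^n - 1$. For $\mathrm{nM}_1$ it does not use the recursion of Definition 3.1 but only the closed description given above (subtract 1 exactly at the indices $k = 2^m - m + 1$ with $m \ge 3$), so it is a check of that description rather than an independent definition. The function names are illustrative.

```python
def nM(k):
    """k-th natural number (k >= 1) that is not of the form 2^n - 1."""
    x = count = 0
    while count < k:
        x += 1
        if x & (x + 1):  # non-zero iff x + 1 is not a power of two
            count += 1
    return x

def nM1(k):
    """nM_1 via the closed description: nM(k) - 1 iff k = 2^m - m + 1, m >= 3."""
    m = 3
    while 2**m - m + 1 < k:
        m += 1
    return nM(k) - 1 if 2**m - m + 1 == k else nM(k)

print([nM(k) for k in range(1, 9)])   # [2, 4, 5, 6, 8, 9, 10, 11]
print([nM1(k) for k in range(1, 9)])  # [2, 4, 5, 6, 8, 8, 10, 11]
```

The first index where the two sequences differ is k = 6, and at the exceptional indices the value of $\mathrm{nM}_1$ is the power of two $2^m$.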



Conjecture 5.1 For all $k \in \mathbb{N}$ we have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \ge \mathrm{nM}(k) - 1$.

Now we consider the question whether the bound holds for a larger class of
clause-sets, that is, whether Theorem 4.1 can be generalised further, incorporating
non-lean clause-sets. We consider the large class $\mathcal{MLEAN}$ of matching lean clause-sets,
as introduced in [?], which is natural, since a basic property of $F \in \mathcal{MU}$ used
in the proof of Theorem 4.1 is $\delta(F) \ge 1$ for $F \ne \top$, and this actually holds for
all $F \in \mathcal{MLEAN}$. We will construct for arbitrary deficiency $k \in \mathbb{N}$ and $K \in \mathbb{N}$
clause-sets $F \in \mathcal{MLEAN}$ of deficiency k where every variable occurs positively at
least K times. Thus neither the upper bound $\max(\mathrm{ld}_F(v), \mathrm{ld}_F(\overline{v})) \le f(\delta(F))$ nor
$\mathrm{ld}_F(v) + \mathrm{ld}_F(\overline{v}) = \mathrm{vd}_F(v) \le f(\delta(F))$ for some chosen variable v and for any function
f does hold for $\mathcal{MLEAN}$.
An example for $F \in \mathcal{MLEAN}_{\delta=1}$ with $\mu\mathrm{ld}(F) \ge 2$ (and thus $\mu\mathrm{vd}(F) \ge 4$) is
given in Section 5 in [?], displaying a "star-free" (thus satisfiable) clause-set F with
deficiency 1. In Subsection 9.3 in [11] it is shown that this clause-set is matching
lean. "Star-freeness" in our context means that there are no singular variables
(occurring in one sign only once). Our simpler construction pushes the number of
positive occurrences arbitrarily high, but there are variables with only one negative
occurrence (i.e., there are singular variables).

For a finite set V of variables let $M(V) \subseteq A(V)$ be the full clause-set over V
containing all full clauses with at most one complementation. Obviously $\delta(M(V)) = 1$
holds, and it is easy to see that $M(V) \in \mathcal{MLEAN}$ (for every $\top \ne F' \subset F \subseteq A(V)$
we have $\delta(F') < \delta(F)$, and thus a full clause-set F is matching lean iff $\delta(F) \ge 1$).
Furthermore by definition we have $\mathrm{ld}_{M(V)}(v) = |V|$ and $\mathrm{ld}_{M(V)}(\overline{v}) = 1$ for $v \in V$.
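
The stated properties of M(V) are easy to confirm on a concrete instance. In the following Python sketch the representation is an assumption: variables are positive integers, a clause is a frozenset of non-zero integers, and a negative integer denotes the complemented variable. The names `M` and `literal_degree` are illustrative.

```python
def M(V):
    """All full clauses over V with at most one complemented variable."""
    V = sorted(V)
    return [frozenset(V)] + [frozenset(-u if u == v else u for u in V) for v in V]

def literal_degree(F, x):
    """Number of clauses of F containing the literal x."""
    return sum(1 for C in F if x in C)

V = [1, 2, 3, 4]
F = M(V)
print(len(F) - len(V))                     # deficiency: 1
print([literal_degree(F, v) for v in V])   # positive occurrences: [4, 4, 4, 4]
print([literal_degree(F, -v) for v in V])  # negative occurrences: [1, 1, 1, 1]
```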

Lemma 5.2 For $k \in \mathbb{N}$ and $K \in \mathbb{N}$ there are clause-sets $F \in \mathcal{MLEAN}_{\delta=k}$ such
that for all variables $v \in \mathrm{var}(F)$ we have $\mathrm{ld}_F(v) \ge K$.

Proof For k = 1 we can set $F := M(\{v_1, \dots, v_K\})$; so assume $k \ge 2$. Consider
any clause-set $G \in \mathcal{MLEAN}_{\delta=k-1}$ with $n := n(G) \ge K$ (for example we could use
some $G \in \mathcal{MU}_{\delta=k-1}$), and let V := var(G). Consider a disjoint copy of V, that is, a set
V' of variables with $V' \cap V = \emptyset$ and |V'| = |V|, and consider two enumerations of
the clauses $M(V) = \{C_1, \dots, C_{n+1}\}$, $M(V') = \{C'_1, \dots, C'_{n+1}\}$. Now

$F := G \cup \{C_i \cup C'_i : i \in \{1, \dots, n+1\}\}$

has no non-trivial matching autarky: If $\varphi$ is a matching autarky for F, then $\mathrm{var}(\varphi) \cap V = \emptyset$
since G is matching lean, whence $\mathrm{var}(\varphi) \cap V' = \emptyset$ since M(V') is matching lean, and
thus $\varphi$ must be trivial. Furthermore we have n(F) = 2n and c(F) = c(G) + n + 1,
and thus $\delta(F) = c(G) + n + 1 - 2n = \delta(G) + 1 = k$. By definition for all variables
$v \in \mathrm{var}(F)$ we have $\mathrm{ld}_F(v) \ge n$. □
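
One step of the construction in this proof can be replayed on a small instance. The Python sketch below makes several assumptions that are not part of the lemma: clauses are frozensets of non-zero integers, the starting clause-set is chosen as G = M({1, 2, 3}) (deficiency 1), the disjoint copy V' is obtained by shifting variable indices, and matching leanness is tested by brute force via the characterisation that every strict sub-clause-set has strictly smaller deficiency. The function names are hypothetical.

```python
from itertools import combinations

def variables(F):
    return {abs(x) for C in F for x in C}

def deficiency(F):
    return len(F) - len(variables(F))

def M(V):
    """All full clauses over V with at most one complemented variable."""
    V = sorted(V)
    return [frozenset(V)] + [frozenset(-u if u == v else u for u in V) for v in V]

def extend(G):
    """F := G together with the clauses C_i | C'_i, for V = var(G) and a copy V'."""
    V = sorted(variables(G))
    shift = max(V)                      # guarantees that V' is disjoint from V
    V_copy = [v + shift for v in V]
    return list(G) + [C | D for C, D in zip(M(V), M(V_copy))]

def is_matching_lean(F):
    # Assumed characterisation: delta(F') < delta(F) for all strict F' of F.
    F = list(F)
    d = deficiency(F)
    return all(deficiency(sub) < d
               for r in range(len(F))
               for sub in combinations(F, r))

G = M([1, 2, 3])
F = extend(G)
print(len(variables(F)), len(F), deficiency(F))                 # 6 8 2
print(min(sum(1 for C in F if v in C) for v in variables(F)))   # 3
print(is_matching_lean(F))                                      # True
```

Applying `extend` again to the result raises the deficiency by one more, in line with the induction on k.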
Remarks: 

1. It remains open whether for deficiency $k \in \mathbb{N}$ we find examples $F \in \mathcal{MLEAN}_{\delta=k}$
with $\mu\mathrm{ld}(F) \ge k + 1$ (the above mentioned star-free clause-set shows that this
is the case for k = 1), or stronger, $\mu\mathrm{ld}(F) \ge K$ for arbitrary $K \in \mathbb{N}$.

2. The clause-sets F constructed in Lemma 5.2 are not elements of $\mathcal{MLCR}_{\delta=k}$
for $k \ge 2$, since $\delta(F[V']) = n + 1 - n = 1$, thus $\sigma(F) = 1$, and so Condition
2 of Definition 4.10 is not fulfilled. The corresponding autarky is a satisfying
assignment of M(V'), which is easy to find.
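
Remark 2 can be illustrated on the smallest case of the construction. The Python sketch below assumes G = M({1, 2, 3}) and V' = {4, 5, 6} (both choices are only for illustration), computes $\delta(F[V'])$ as the number of clauses containing a variable of V' minus |V'|, and uses the assignment setting every variable of V' to true, which is one satisfying assignment of M(V') whenever |V'| is at least 2. An assignment is treated as an autarky if it satisfies every clause it touches. The function names are hypothetical.

```python
def M(V):
    """All full clauses over V with at most one complemented variable."""
    V = sorted(V)
    return [frozenset(V)] + [frozenset(-u if u == v else u for u in V) for v in V]

def restricted_deficiency(F, V):
    # Assumed delta(F[V]): clauses touching V, minus |V|.
    V = set(V)
    return sum(1 for C in F if any(abs(x) in V for x in C)) - len(V)

def is_autarky(F, phi):
    """phi maps variables to booleans; every touched clause must be satisfied."""
    for C in F:
        touched = [x for x in C if abs(x) in phi]
        if touched and not any(phi[abs(x)] == (x > 0) for x in touched):
            return False
    return True

n = 3
V = list(range(1, n + 1))
V_copy = list(range(n + 1, 2 * n + 1))
G = M(V)
F = G + [C | D for C, D in zip(M(V), M(V_copy))]

print(restricted_deficiency(F, V_copy))  # 1
phi = {v: True for v in V_copy}          # satisfies M(V') since |V'| >= 2
print(is_autarky(F, phi))                # True
```

The assignment leaves the clauses of G untouched and satisfies all clauses containing a variable of V', so those clauses can be removed satisfiability-equivalently.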



6 Conclusion and open problems 



We have shown the upper bound $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$ for lean clause-sets (Theorem
4.1). The function nM(k) has been characterised in Lemma 3.3 and Corollary 3.9.
We presented initial results regarding the sharpness of the bound and regarding
the constructive aspects of the bound (i.e., what happens if the bound is violated).
There remain several open problems: 



1. Prove Conjecture 4.3, which says that such an autarky, which must exist if a
clause-set does not fulfil the upper bound on the minimum variable degree of
Theorem 4.1, can be found in polynomial time. See Subsection 4.3 for more
information on this topic.

2. Generalise Theorem 4.5 to clause-sets with non-boolean variables; see the
discussion after Theorem 4.5.

3. See the remarks to Corollary 4.8 (an underlying question is to understand
better the quantity "surplus").

4. Strengthen the bound on the minimum variable degree for minimally
unsatisfiable clause-sets (see the forthcoming [13]).

5. Strengthen the construction of Lemma 5.2 (perhaps completely different
constructions are needed).

As mentioned in the introduction, a major motivation for us is the project of 
the classification of minimally unsatisfiable clause-sets for deficiencies $\delta = 1, 2, \dots$.
Especially the classification of unsatisfiable hitting clause-sets in dependency on the
deficiency seems very interesting (recall that a hitting clause-set F is defined by the
condition that every two clauses $C, C' \in F$, $C \ne C'$, clash in at least one variable,
that is $|C \cap \overline{C'}| \ge 1$). The main conjecture is:

Conjecture 6.1 For every deficiency $k \in \mathbb{N}$ there are only finitely many
isomorphism types of non-singular unsatisfiable hitting clause-sets.

For $k \le 2$ this conjecture follows from known results, while recently we were able
to prove it for k = 3.
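
The hitting condition recalled above (every two distinct clauses clash in at least one variable) can be tested pairwise. The following Python sketch assumes that clauses are frozensets of non-zero integers, with a negative integer denoting a complemented variable, and that the clause-set contains no repeated clauses. The function names are illustrative.

```python
from itertools import combinations

def clash(C, D):
    """True iff some literal of C occurs complemented in D."""
    return any(-x in D for x in C)

def is_hitting(F):
    """True iff every two distinct clauses of F clash."""
    return all(clash(C, D) for C, D in combinations(list(F), 2))

A2 = [frozenset(c) for c in ({1, 2}, {1, -2}, {-1, 2}, {-1, -2})]
print(is_hitting(A2))                                      # True
print(is_hitting([frozenset({1, 2}), frozenset({2, 3})]))  # False
```

The first example is the full clause-set over two variables, which is an unsatisfiable hitting clause-set of deficiency 2.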